(54) Multi-mode MPEG decoder

(57) A television receiver with an MPEG decoder is
configurable for full high definition decoding and display,
or reduced cost lower definition display. The MPEG
decoder (10-33) uses a controllable dual-mode data
reduction network selectively employing horizontal
detail reduction (29) and data re-compression (30)
between the decoder and the decoder frame memory
(20) from which image information to be displayed (27)
is derived. The amount of data reduction is manufacturer
selected in accordance with the resolution of the
display device, e.g., equal to or less than high definition
resolution. The frame memory size is also manufacturer
selected in accordance with the resolution of the display
device.
Figure 2 depicts a memory mapping procedure.

Figure 3 is a block diagram of a compression network useful in the MPEG decoder of Figure 1.
Figures 4 and 5 show additional details of the network of Figure 3.

Figures 6 and 7 depict pixel arrangements helpful in understanding aspects of the operation of the networks shown
in Figures 4 and 5.

Figure 8 depicts an alternative dual path compression network.
Figure 9 depicts pixel decimation and upsampling.

Figure 10 is a block diagram of apparatus for performing the process depicted in Figure 9.
Figure 11 is a block diagram illustrating display buffering of pixels from memory to a display processor.
Figure 12 depicts the arrangement of Figure 1 in the context of a simplified practical receiver.

Figure 1 depicts a portion of a digital video signal processor such as may be found in a television receiver for
processing an input high definition video signal. The processor may be included in an integrated circuit which includes
provision for receiving and processing standard definition video signals via an analog channel. The video processor
includes a conventional MPEG decoder constituted by blocks 10, 12, 14, 16, 18, 20 and 22. An MPEG encoder and
decoder are described, for example, by Ang et al., "Video Compression Makes Big Gains," IEEE Spectrum, October
1991.

The system of Figure 1 receives a controlled datastream of MPEG coded compressed data from a preceding input
processor, e.g., a transport decoder, which separates data packets after input signal demodulation. In this example the
received input datastream represents high definition image material (1920 x 1088) as specified in the Grand Alliance
specification for the United States high definition terrestrial television broadcast system. The input datastream is in the
form of data blocks representing 8x8 pixels (picture elements). This data represents compressed, coded intraframe and
interframe information. The intraframe information comprises I-frame anchor frames. The interframe information
comprises predictive motion coded residual image information representing the image difference between adjacent picture
frames. The interframe motion coding involves generating motion vectors that represent the offset between a current
block being processed and a block in a prior reconstructed image. The motion vector which represents the best match
between the current and prior blocks is coded and transmitted. Also, the difference (residual) between each motion
compensated 8x8 block and the prior reconstructed block is DCT transformed, quantized and variable length coded
before being transmitted. This motion compensated coding process is described in greater detail in various publications
including the Ang et al. article mentioned above.

The MPEG decoder exhibits reduced memory operating modes which allow a significant reduction in the amount
of memory required to decode high definition image sequences in reduced cost receivers. As will be explained
subsequently, these modes involve compressing video frames to be stored in memory and selectively horizontally filtering and
decimating pixel data within the decoder loop. For example, in one mode the system provides anchor frame
compression. In another mode the system provides compression after horizontal detail reduction by low pass filtering and
downsampling. Block compression may be used without decimation, but horizontal decimation without compression is not a
recommended practice for this system. Although both compression and decimation produce memory reduction by a
factor of two, compression produces better pictures than horizontal decimation. Any processing (e.g., compression and
decimation) in the decoder loop may produce artifacts. Decimation prior to compression is preferable, but in some
systems compression may precede decimation.
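
As a rough illustration of the memory savings described above, the following Python sketch computes the size of one stored frame under each mode. It assumes 8-bit samples and 4:2:0 chroma sampling (1.5 bytes per pixel), neither of which is stated in this passage, and the function name `frame_bytes` is hypothetical.

```python
# Hypothetical helper: size of one stored frame under the reduced-memory modes.
# Assumptions (not from the text): 8-bit samples, 4:2:0 chroma (1.5 bytes/pixel).

def frame_bytes(width, height, decimate=False, compress=False):
    """Return the number of bytes reserved for one stored frame."""
    size = width * height * 3 // 2   # luma plane plus two quarter-size chroma planes
    if decimate:
        size //= 2                   # horizontal decimation by a factor of two
    if compress:
        size //= 2                   # block compression to 50% of original size
    return size

print(frame_bytes(1920, 1088))                                # no data reduction
print(frame_bytes(1920, 1088, compress=True))                 # compression only
print(frame_bytes(1920, 1088, decimate=True, compress=True))  # decimation, then compression
```

Under these assumptions each enabled stage halves the frame store, so enabling both stages leaves one quarter of the full-size requirement.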

The input compressed pixel data blocks are buffered by unit 10 before being variable length decoded by unit 12.
Buffer 10 exhibits a storage capacity of 1.75 Mbits in the case of a main level, main profile MPEG datastream. Decoded
compressed data from unit 12 is decompressed by inverse quantization unit 14 and by inverse discrete cosine
transformation (DCT) unit 16 before being applied to one input of an adder 18. It is noted that unit 16 employs full inverse DCT
processing. No DCT coefficients are discarded since the present inventors consider this to be an unacceptable filtering
technique, e.g., for reducing the DCT computational load. Filtering before decimation (as shown in Figure 10) is
preferred. Dropping DCT coefficients, which is similar to horizontal and vertical decimation, is a crude form of compression
and is not equivalent to filtering, and makes it difficult or impossible to filter properly.

The quantization step size of inverse quantizer 14 is controlled by a signal from buffer 10 to assure a smooth data
flow. Decoded motion vectors are provided from decoder 12 to a motion compensation unit 22 as will be discussed
below. Decoder 12 also produces an inter/intra frame mode select control signal, as known, which is not shown to
simplify the drawing. The operations performed by units 12, 14 and 16 are the inverse of corresponding operations
performed by an encoder at a transmitter. The MPEG decoder of Figure 1 reconstitutes the received image using known
MPEG processing techniques which are described briefly below.

A reconstructed pixel block is provided at the output of adder 18 by summing the residual image data from unit 16
with predicted image data provided at the output of motion compensation unit 22 based on the contents of video frame
memory 20. When an entire frame of pixel blocks has been processed, the resulting reconstructed image is stored in
frame memory 20. In the interframe mode, motion vectors obtained from decoder 12 are used to provide the location of
the predicted blocks from unit 22.
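
The summation performed at adder 18 can be sketched in Python as follows. This is a minimal illustration, assuming integer-pixel motion vectors, 8-bit pixel values clamped to 0..255, and a reference frame held as a plain 2-D list; the function name and argument layout are hypothetical and do not come from the source.

```python
# Hypothetical sketch of motion-compensated reconstruction (adder 18).
# Assumptions: integer-pel motion vectors, 8-bit samples, reference frame as 2-D list.

def reconstruct_block(residual, reference, mv, top, left, size=8):
    """Add a residual block to the prediction fetched from the reference frame.

    residual  -- size x size list of ints (inverse DCT output)
    reference -- 2-D list holding the prior reconstructed frame
    mv        -- (dy, dx) motion vector in whole pixels
    top, left -- position of the current block in the frame
    """
    dy, dx = mv
    out = []
    for r in range(size):
        row = []
        for c in range(size):
            pred = reference[top + dy + r][left + dx + c]   # predicted pixel
            row.append(max(0, min(255, pred + residual[r][c])))  # clamp to 8 bits
        out.append(row)
    return out
```

For intra coded blocks the prediction term would simply be zero, which this sketch does not model.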

The image reconstruction process involving adder 18, memory 20 and motion compensation unit 22 advanta- 
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definition MPEG decoding. A data bus of this width is within the capability of current technology, and requires a con- 
servative operating speed of 40 MHz. In this example the bi-directional external memory bus which connects the output 
of multiplexer 31 to memory 20 has an available bit width of 96 bits, of which programmable bit widths of 96, 64, 48 or 
less are used for data depending on the receiver operating configuration as discussed above. 

The interface between the external memory bus and the internal memory bus is effected by using multiplexer 31 to
translate from the internal memory bus to the external memory bus. Access to memory 20 is defined in terms of integer
multiples of 192 bits. Depending on the receiver configuration with respect to the different levels of image quality as
mentioned previously, data to be written to memory 20 from compressor 30 is demultiplexed by unit 31 from 192 bits to
the target width of the external memory bus (96, 64, 48, or 32 bits). Data to be read from memory 20 to decompressor
32 is multiplexed by unit 31 from the external bus width to the 192 bit internal bus width.

Depending on the receiver configuration, different amounts of system bandwidth are required to support the
associated resolution of a displayed image. Greater bandwidth is achieved by using wider data paths. Thus different memory
data path widths are required for different system configurations and image resolution. Since the internal memory bus
data path is an integer multiple of the (external) memory bus data path, the clock rate for the internal memory path is
always less than the clock rate for the external memory path. An internal data word can always be constructed from an
integer number of external data words. Similarly, an integer number of external data words can be generated from an
internal data word.
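
The width translation performed by multiplexer 31 can be illustrated with a short Python sketch. It treats a 192-bit internal word as a Python integer and splits it into external words of 96, 64, 48 or 32 bits. The most-significant-word-first ordering and the function names are assumptions made for illustration only.

```python
# Hypothetical sketch of the 192-bit internal / narrower external bus translation.
# Assumption: external words are transferred most significant word first.

INTERNAL_BITS = 192

def to_external(word, ext_bits):
    """Split one 192-bit internal word into 192 // ext_bits external words."""
    if INTERNAL_BITS % ext_bits:
        raise ValueError("external width must divide 192")
    n = INTERNAL_BITS // ext_bits
    mask = (1 << ext_bits) - 1
    return [(word >> (ext_bits * (n - 1 - i))) & mask for i in range(n)]

def to_internal(words, ext_bits):
    """Reassemble external words into one internal word."""
    word = 0
    for w in words:
        word = (word << ext_bits) | w
    return word

word = (1 << 191) | 0xABCDEF
for width in (96, 64, 48, 32):
    parts = to_external(word, width)
    print(width, len(parts), to_internal(parts, width) == word)
```

Because 192 is divisible by each of the listed widths, every internal word maps to a whole number of external words (2, 3, 4 or 6 respectively).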

As will be discussed in connection with Figure 12, units 29-34 are controlled by a local microprocessor depending
on whether or not the MPEG decoder is situated in a high definition receiver or a receiver with somewhat reduced
resolution. The microprocessor is programmed to determine the amount of compression performed by unit 30, and
whether or not decimator 29 is enabled (to downsample data) or bypassed (to convey data from adder 18 to compressor
30 without downsampling). The microprocessor also instructs multiplexer 31 to select, from an available 96 bit wide
memory path, the memory data path width required for a particular receiver configuration, e.g., a 96, 64 or less bit wide
path. The system provides full high definition MPEG decoding without memory reduction by using appropriate software
control mechanisms to disable or bypass the decimation and compression functions.

A pictorial representation of the reduced memory requirements of memory device 20 is shown in Figure 2. To
simplify the discussion the following description is given in the context of compression by unit 30 alone. In Figure 2, the
memory map on the left represents a mapping of pixel blocks within a full size memory. The map on the right illustrates
how a 50% smaller memory is used to store blocks compressed by unit 30. As will be seen from the following discussion
of the compression network shown in Figure 3, each block (e.g., block C) is guaranteed to fit within 50% of the space
normally required for a full size memory, or less. That is, the compression provided by unit 30 is 50% or more. In this
example any unused memory space remaining after compression is left unused so that the starting position of the data
for any block is a known location, or starting address.
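
Because each compressed block occupies a fixed-size slot, its starting address follows directly from its block index. The Python sketch below illustrates this under stated assumptions: 8x8 blocks of 8-bit pixels (64 bytes uncompressed), a raster ordering of blocks in memory, and a slot equal to half the uncompressed size. The function and parameter names are hypothetical.

```python
# Hypothetical sketch of the fixed block-to-address mapping in the reduced memory.
# Assumptions: 64-byte uncompressed blocks, raster block order, 50% slot size.

def block_start_address(block_row, block_col, blocks_per_row,
                        full_block_bytes=64, reduction=2, base=0):
    """Return the starting address reserved for one compressed block."""
    slot = full_block_bytes // reduction            # bytes reserved per block
    index = block_row * blocks_per_row + block_col  # raster index of the block
    return base + index * slot

# A 1920-pixel-wide frame holds 240 blocks of 8 pixels per block row.
print(block_start_address(0, 1, 240))
print(block_start_address(1, 0, 240))
```

A block that compresses to less than its slot simply leaves the tail of the slot unused, which keeps every starting address independent of how well neighbouring blocks compressed.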

In the full size memory, any particular pixel can be located and accessed because of a fixed mapping between the
video frame pixels and the memory pixel addresses. The reduced size memory does not exhibit pixel-by-pixel mapping.
Instead, pixel blocks are mapped into the memory. If a particular pixel from within a particular block is needed, it may
be necessary to access the entire block of data. Any memory space not needed for MPEG decoding is available for
other purposes such as on-screen display, microprocessor RAM, transport buffers or other special buffers, for example.

Referring back to Figure 1, the use of compressor 30 prior to storing data in memory 20 requires that data be
decompressed prior to unit 22 in the motion compensation processing loop. This is accomplished by block-based
decompressor 32, which exhibits the inverse of the operation of compressor 30. Block-based display decompressor 34
is similar to unit 32 and decompresses stored pixel blocks before being conveyed to a display processor 26. Processor
26 may include, for example, an NTSC coding network, circuits for conditioning the pixel data for display, and a display
driver network for providing video signals to an image reproducing device 27, e.g., a kinescope. Similarly, when
downsampling unit 29 is enabled prior to memory 20, data from memory 20 is upsampled prior to unit 22 in the motion
compensation processing loop. This is accomplished by horizontal upsampling unit 33, which exhibits the inverse of the
operation of unit 29. Display device 27 may exhibit full high definition image resolution. Alternatively, a less expensive
image display device with less than full high definition image resolution may be used in a more economical receiver
design, in which case data reduction network 29, 30 is programmed and the size of memory 20 is chosen as discussed
previously.

Data from stored anchor frames such as I frames are generally accessed in a random fashion according to the
motion vectors received in the input compressed data stream. A block based compression scheme maintains
reasonable accessibility of pixel data from the frame memory. An 8 x 8 pixel block has been found to work well with the
disclosed compression scheme. Larger pixel blocks allow the use of sophisticated compression techniques at the expense
of reduced pixel accessibility. Smaller blocks allow finer granularity in accessing pixels at the expense of fewer options
for compression. Various types of compression, including quantization and transformation, may be used to implement
the function of compressor 30 depending on the requirements of a particular system.

The type of compression used should preferably, but not necessarily, exhibit certain characteristics. Each block 
should be compressed a predetermined amount (or more in some systems) so that the location of each compressed 
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316 is successful in achieving greater than the target compression amount, the output of compressor 316 is used and 
some of the reserved memory space is not used by the compressed block data. That is, each compressed block begins 
to fill its reserved memory area beginning with a predetermined starting address and continuing to an address less than 
the last address reserved for that block. This process is discussed in connection with Figure 2. 

It is desirable for block based compression to be capable of achieving both high compression efficiency and easy
access to each pixel of a pixel block, even though these two results are competing in nature. That is, high compression
efficiency requires a large block size, while easy access to pixels requires a small block size. It has been found that both
of these characteristics can be substantially achieved with pixel block sizes of 8x8 pixels and 16x4 pixels. The blocks
are formed into the required NxN pixel sizes in unit 10 as mentioned previously.

In this example each field based pixel block is scanned in a raster manner as shown in Figure 6, from left to right
in a downward direction. This scanning is done in both units 316 and 322 using delay elements 452-456 and delay
elements 552-556 as shown in Figures 4 and 5 respectively, as will be discussed. The variable compression network is
shown in Figure 4. This network uses a DPCM loop with adaptive prediction to produce a difference signal (residual)
using known techniques. This difference is variable length coded, and the resulting number of coded difference bits is
monitored to indicate whether or not the desired compression factor was achieved for the current block.

In Figure 4, differencing network 442 produces an output representing the difference (residual) between input pixel
values applied to a non-inverting input (+) of unit 442 and predicted pixel values applied to an inverting input (-) of unit
442, respectively. The predicted value is obtained using a DPCM processing loop including differencer 442, variable
length coder 444 and a variable length decoder 446 which performs the inverse of the coding operation performed by
unit 444. The variable length coder can include an optional high resolution quantizer and an entropy encoder (e.g., a
Huffman coder) for lossless or near lossless compression. The variable length decoder includes an inverse quantizer
and entropy decoder. The inversely decoded output from unit 446 is summed in a unit 448 with an output from a
prediction network including a predictor 450 and associated pixel delay elements 452, 454 and 456. These elements
provide delays of one, seven and one pixels, respectively. A predicted pixel value output from unit 450 is applied to inputs
of adder 448 and differencer 442.

Figure 7 shows an exemplary arrangement of a group of four pixels A, B, C and X (the pixel to be predicted)
associated with the predictive processing and coding operation of the DPCM network. This group of pixels is also referenced
in the pixel block shown in Figure 6. In this example pixel B is delayed by a one pixel interval relative to pixel C, pixel A
is delayed by a seven pixel interval relative to pixel B, and pixel X is delayed one pixel interval relative to pixel A. The
DPCM prediction process is well-known and will be discussed subsequently. Compressed pixel data from the output of
variable length coder 444 are buffered by a unit 460 before being provided to MUX 325 of Figure 3. Buffer 460 stores
the output of the variable compression process until the entire block has been processed, at which time it can be
determined whether or not the target compression factor has been reached.
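
The DPCM loop described above can be sketched in Python as a lossless predictor and residual generator over one 8x8 block in raster order. Several points are assumptions rather than details from the source: A is taken as the left neighbour, B as the pixel above and C as the pixel above-left (one reading of the one, seven and one pixel delays); the adaptive rule that picks A or B is a common textbook choice; block edges are handled explicitly; and the first pixel is predicted from mid-grey. Entropy coding of the residuals is omitted.

```python
# Hypothetical lossless DPCM sketch for one 8x8 block scanned in raster order.
# Assumptions: A = left, B = above, C = above-left; the selection rule and the
# edge handling below are illustrative, not taken from the source.

W = 8  # block width in pixels

def neighbours(recon, i):
    """Return (A, B, C) for pixel i from already reconstructed pixels."""
    row, col = divmod(i, W)
    a = recon[i - 1] if col > 0 else None
    b = recon[i - W] if row > 0 else None
    c = recon[i - W - 1] if row > 0 and col > 0 else None
    return a, b, c

def predict(a, b, c):
    """Adaptive predictor: follow the direction of least change."""
    if a is None and b is None:
        return 128                      # assumed start value for the first pixel
    if a is None:
        return b
    if b is None:
        return a
    return b if abs(a - c) < abs(b - c) else a

def dpcm_encode(block):
    """Return the residual for each pixel of a 64-pixel block."""
    recon, residuals = [], []
    for i, x in enumerate(block):
        p = predict(*neighbours(recon, i))
        residuals.append(x - p)
        recon.append(p + residuals[-1])  # lossless, so this equals x
    return residuals

def dpcm_decode(residuals):
    """Rebuild the block by running the same predictor on decoded pixels."""
    recon = []
    for i, e in enumerate(residuals):
        recon.append(predict(*neighbours(recon, i)) + e)
    return recon
```

The encoder predicts from reconstructed pixels rather than from the raw input so that the decoder, which only has reconstructed pixels, forms identical predictions.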

The bit count of each compressed block output from coder 444 is monitored by bit counter 418, which may be
implemented by any of several known techniques. After each pixel block has been variably compressed, counter 418
provides a Control output signal if the compressed bit count is at or below a predetermined threshold, indicating that the
desired amount of compression has been reached or exceeded by the variable compressor. This Control signal is
applied to the switching control input of MUX 325 for causing MUX 325 to convey the output from the variable length
compressor to the utilization network. Otherwise, the compressed block output (for the same pixel block) from the fixed
length compressor is conveyed to the utilization network.
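
The selection made by bit counter 418 and MUX 325 amounts to a threshold test on the variable-length output. A minimal Python sketch follows, assuming each compressed block is represented as a string of '0' and '1' characters and that an uncompressed 8x8 block of 8-bit pixels occupies 512 bits. The function name and the return convention are hypothetical.

```python
# Hypothetical sketch of the variable/fixed path selection (counter 418, MUX 325).
# Assumptions: bitstreams are '0'/'1' strings; uncompressed block = 8*8*8 = 512 bits.

def select_output(variable_bits, fixed_bits, original_bits=512, target=0.5):
    """Pick the variable-length result if it meets the target, else the fixed one."""
    threshold = int(original_bits * target)   # largest acceptable bit count
    if len(variable_bits) <= threshold:
        return "variable", variable_bits
    return "fixed", fixed_bits

print(select_output("1" * 200, "0" * 256)[0])   # variable path met the target
print(select_output("1" * 300, "0" * 256)[0])   # variable path too long
```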

The fixed compression network is shown in Figure 5. This network also uses a DPCM loop with adaptive prediction,
as in the case of the variable compressor. In Figure 5 elements 548, 550, 552, 554 and 556 perform the same
functions as corresponding elements in Figure 4. Differencing network 542 serves the same purpose as unit 442 in Figure
4 for producing a residual pixel value, but in a slightly different context as discussed below.

The fixed compression network uses non-linear quantizing of the difference (residual) pixel values provided at the
output of unit 542 as a result of DPCM processing. A non-inverting input (+) of unit 542 receives input pixel values
delayed 64 pixel intervals by a 64-pixel delay element 555. The inverting input (-) of unit 542 receives predicted pixel
values from predictor 550. The residual pixel value output from unit 542 is subjected to quantization and inverse
quantization by units 556 and 558 respectively. The quantization provided by unit 556 is fixed and guarantees a desired fixed
amount of data compression. For example, to achieve 50% compression of an 8-bit data word, unit 556 removes the
four least significant bits. The amount of fixed compression is not less than the desired amount of compression.
Units 556 and 558 operate under control of a Min/Max comparison network 560 which determines the minimum and
maximum pixel values for each pixel block.

Quantizer 556 could also be arranged to use a fixed quantizer rule. However, it is more efficient to adapt the
quantizer rule according to the minimum and maximum pixel values associated with the block being processed. Min/Max
comparison unit 560 determines these values. Element 555 provides the time delay needed for the minimum and
maximum values of all 64 pixels of a given block to be examined before the first pixel of the block is processed.
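
The idea of adapting the quantizer to the block minimum and maximum can be illustrated with a simplified Python sketch. It departs from the network of Figure 5 in ways that should be noted: it quantizes pixel values directly rather than DPCM residuals, it uses a uniform rather than non-linear rule, and it ignores the bits needed to store the minimum and maximum. The function names are hypothetical.

```python
# Hypothetical sketch of a block-adaptive 4-bit quantizer driven by min/max.
# Simplifications: uniform steps, applied to pixel values instead of DPCM
# residuals, and no accounting for the min/max side information.

def quantize_block(block, bits=4):
    """Map each pixel to a code in 0 .. 2**bits - 1 scaled to the block range."""
    lo, hi = min(block), max(block)      # requires a full pass over the block
    levels = (1 << bits) - 1
    span = hi - lo
    if span == 0:
        return lo, hi, [0] * len(block)
    # Integer arithmetic equivalent to rounding (x - lo) * levels / span.
    codes = [(2 * (x - lo) * levels + span) // (2 * span) for x in block]
    return lo, hi, codes

def dequantize_block(lo, hi, codes, bits=4):
    """Rebuild approximate pixel values from codes and the block range."""
    levels = (1 << bits) - 1
    span = hi - lo
    return [lo + (2 * q * span + levels) // (2 * levels) for q in codes]
```

The first pass that finds the minimum and maximum is the software counterpart of the one block delay introduced by element 555: no pixel can be quantized until the whole block has been examined.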

Referring back to Figure 3, compressor 322 has no inherent delay, but the combination of the min/max comparison 
and delay element 555 (Figure 5) causes compressor 322 to exhibit a one block delay, which matches the one block 
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pression by unit 32, the resolution of image information from memory 20 is reconstituted by unit 33 using a pixel repeat 
up-sampling process. The up-sampling process is not required between display decompressor 34 and display
processor 26 (Figure 1) since processor 26 will provide any required horizontal sample rate conversion. It is expected that
decompressor 34 and processor 26 will not perform upsampling in a reduced cost receiver because of the reduced
display resolution provided by such a receiver. In such case memory reduced decoded frames have higher resolution than
a standard definition display. For example, to decode and display a 1920 x 1088 pixel video sequence on a 720 x 480
pixel display device requires that images stored in frame memory have a resolution of 960 x 1088 (with horizontal
decimation by a factor of two). Thus decompressor 34 does not need to upsample images, but display processor 26 will
have to downsample the 960 x 1088 resolution image to 720 x 480 to be suitable for display.

Figures 9 and 10 respectively illustrate the general arrangement of elements associated with the pixel decimation
and upsampling process. In unit 29 the original pixels are first low pass filtered by an even order low pass filter 1010
before being decimated by two, whereby every other pixel value is removed by unit 1012. These pixels are stored in
memory 20. Afterwards, pixel data from memory 20 are repeated by element 1014 of upsampling unit 33 using well
known techniques. When unit 29 is bypassed, the input to unit 1010 is re-routed directly to the output of unit 29 under
microprocessor control. This switching can be implemented by a variety of known techniques.

It is noted that unit 29 uses only horizontal decimation within the decoding loop rather than both horizontal and
vertical decimation. The use of horizontal decimation alone advantageously eliminates artifacts which would be produced
by vertical decimation of an interlaced video field. The horizontal decimation process produces no spatial shift and little
or no degradation due to multiple passes through the decoder loop. This benefit is obtained by using an even order low
pass filter 1010 (Figure 10) before decimation, and by using a simple pixel repeat process as the up-conversion
mechanism. An even order filter with more than two taps crosses macroblock boundaries, i.e., such low pass filter is not
restricted to intra macroblock processing. This yields true horizontal spatial lowpass filtering. The simple pixel repeat
operation used in the up-conversion process generally has a poor frequency response as an interpolator. However, any
degradation of the frequency response occurs on the first pass through the loop. Multiple passes produce insignificant
additional loss due to the pixel repeating process.

In this example filter 1010 is an 8-tap symmetrical FIR filter. This filter operates in the horizontal spatial domain and
filters across block boundaries. The 8-tap filter has the effect of shifting the relative position of the output pixels by
one-half sample period relative to the input, as shown in Figure 9. As also shown in Figure 9, the pixel repeat up-sampling
has the effect of maintaining the same spatial position of the downsampled/upsampled pixels relative to the original
pixels.

The number of passes through the decoder loop (in this case two) is determined by the number of B frames
between I or P anchor frames. Decimation filter 1012 may be a two-tap filter so that for input pixels a and b the filter
output is (a+b)/2, and decimation is accomplished by dropping every other pixel. This filter does not cross the block
boundary, is easy to implement, and is a good choice for horizontal decimation.

Pixel repeat up-conversion is used because when pixel repeating upsampling is combined with an averaging
decimation filter, the pixels will remain invariant for a multiple pass decimation and upsampling process. Thus subsequent
passes through the decoder loop do not change the pixel value. Illustratively, low pass filtering by simply averaging a
pair of pixels, followed by decimation and pixel repeat, alters the pixel values the first time through the loop. However, in the second
pass the low pass filter (which averages two pixels) amounts to averaging a pair of repeated pixels. This yields the same
pixel, which in turn is repeated again. Up-sampling preferably should exhibit simple, fast operation since it is in the
important motion compensation loop.
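
The invariance described above is easy to check numerically. The following Python sketch applies a two-tap averaging decimator followed by pixel repeat to one row of pixels, then applies the same pass again. It assumes integer pixel values, an even row length and truncating integer division for the average; the function names are hypothetical.

```python
# Hypothetical sketch: averaging decimation plus pixel repeat is invariant
# after the first pass. Assumes an even number of integer pixels per row.

def decimate(row):
    """Average each non-overlapping pixel pair and keep one value per pair."""
    return [(row[i] + row[i + 1]) // 2 for i in range(0, len(row), 2)]

def upsample(row):
    """Pixel repeat: emit every stored pixel twice."""
    return [p for p in row for _ in range(2)]

def loop_pass(row):
    """One trip through the decoder loop: decimate, store, upsample."""
    return upsample(decimate(row))

row = [10, 20, 30, 50, 7, 9, 100, 100]
first = loop_pass(row)
second = loop_pass(first)
print(first)            # [15, 15, 40, 40, 8, 8, 100, 100]
print(second == first)  # True
```

On the second pass each pair consists of two identical values, so their average returns the same value and the row is unchanged.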

Referring to Figure 11, display processor 26 receives input data from decompressor 34 via a display buffer network
including parallel FIFO buffers 1110 and 1112 and a multiplexer 1114. In Figure 11 blocks 20, 34 and 26 correspond to
similarly labeled blocks in Figure 1. The previously described block based compression/decompression operation is
well suited for memory access needed to support MPEG decoding, and is complemented by the display buffer network
to support display processing. The display buffer network holds sixteen image lines, divided between the two eight-line buffers
1110 and 1112. Uncompressed data for display processing is read from one of the buffers via multiplexer 1114 while
the other buffer is being filled with decompressed data from unit 34. In this example buffers 1110 and 1112 are located
in memory unit 20.

Figure 12 depicts the arrangement of Figure 1 in the context of a practical digital signal processing system in a
television receiver. The Figure has been simplified so as not to burden the drawing with excessive detail. For example, not
shown are FIFO input and output buffers associated with various elements, read/write controls, clock generator
networks, and control signals for interfacing to external memories which can be of the extended data out type (EDO) or
synchronous (SDRAM) type.

Elements in Figure 12 that are common to Figure 1 are identified by the same reference number. The elements
shown in Figure 12, except for elements 29-34, correspond to elements found in the STi 3500A MPEG-2/CCIR 600
Video Decoder integrated circuit commercially available from SGS-Thomson Microelectronics. Motion processor 22
may employ the STi 3220 Motion Estimator Processor integrated circuit also commercially available from
SGS-Thomson Microelectronics. Briefly, the system of Figure 12 additionally includes a microprocessor 1220, bus interface unit
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mation without decimation. 
4. A system according to claim 1, wherein

said decimation means provides horizontally decimated image information and processes vertical image
information without decimation; and

said decompressing means includes an inverse Discrete Cosine Transform (DCT) processor operative without
discarding DCT coefficients.

5. A system according to claim 3, wherein

said decimation means includes an even order low pass filter;

said image information is horizontally decimated by a factor of 2; and

said compression means provides pixel block compression.

6. A system according to claim 1, wherein

said decompressing means and said data reduction means are situated in an integrated circuit; and

said memory means is located external to said integrated circuit.

7. A system according to claim 1, wherein

said decimation means selectively decimates information from said decompressing means to produce
decimated information; and

said compression means selectively compresses said decimated information or said decompressed
information.
8. A system according to claim 1, comprising:

input means for receiving a datastream of compressed image representative MPEG coded information;

means for decompressing said image information to produce decompressed information;

motion information processing means for processing said decompressed information, said motion processing
means including said data reducing means;

said memory means storing data reduced information from said motion processing means; and

said output means includes image processing means.

9. A system according to claim 8, wherein

said data compressing means controllably exhibits predetermined compression factors; and

said data decimation means provides horizontal image decimation and is selectively enabled and disabled.

10. A system according to claim 8, wherein

said compression means re-compresses said decompressed information to produce recompressed
information;

said data decimation means decimates said recompressed information to produce data reduced information;
and

interface means receives said data reduced information via a first path and conveys said data reduced
information to said memory means via a second path.

11. A system according to claim 10, wherein

said data reduction means is situated in an integrated circuit;

said first path is a data bus internal to said integrated circuit; and

said memory means is located external to said integrated circuit.

12. A system according to claim 10, wherein said motion information processing means further includes

data decompressing means for receiving data from said memory means; and